Class Conditional Nearest Neighbor and Large Margin Instance Selection
Abstract
The one nearest neighbor (1-NN) rule classifies new instances using instance proximity followed by class labeling information. This paper presents a framework for studying properties of the training set related to proximity and labeling information, with the aim of improving the performance of the 1-NN rule. To this end, a so-called class conditional nearest neighbor (c.c.n.n.) relation is introduced, consisting of those pairs of training instances (a, b) such that b is the nearest neighbor of a among the instances (excluding a) of one of the classes of the training set. A graph-based representation of c.c.n.n. is used for a comparative analysis of c.c.n.n. and other related proximity-based concepts. In particular, a scoring function on instances is introduced, which measures the effect of removing one instance on the hypothesis margin of the other instances. This scoring function is employed to develop an effective large margin instance selection algorithm, which is empirically shown to improve the storage and accuracy performance of the 1-NN rule on artificial and real-life data sets.
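The abstract packs several definitions into one paragraph. The sketch below, assuming Euclidean distance and using our own function names and toy data (none of which come from the paper), illustrates the c.c.n.n. relation, the hypothesis margin, and a removal-based scoring function in the spirit described above.

```python
# A minimal sketch of the class conditional nearest neighbor (c.c.n.n.) relation
# and the hypothesis margin it supports. Euclidean distance, the toy data, and
# all function names are illustrative assumptions, not the paper's implementation.
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix with the diagonal masked out, so an
    instance is never its own nearest neighbor."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return d

def ccnn_pairs(X, y):
    """For every instance a and every class c, record the pair (a, b) where
    b is the nearest neighbor of a among the instances of class c (excluding a)."""
    d = pairwise_distances(X)
    pairs = []
    for a in range(len(X)):
        for c in np.unique(y):
            members = np.where(y == c)[0]
            b = members[np.argmin(d[a, members])]
            pairs.append((a, b))
    return pairs

def hypothesis_margin(a, y, d):
    """Half the gap between a's nearest miss and nearest hit; positive when
    a's neighborhood supports its own label."""
    hit = np.min(d[a, y == y[a]])
    miss = np.min(d[a, y != y[a]])
    return 0.5 * (miss - hit)

def removal_score(b, X, y):
    """Illustrative scoring function: total change in the remaining instances'
    hypothesis margins when instance b is deleted from the training set."""
    d = pairwise_distances(X)
    keep = np.arange(len(X)) != b
    d_sub = pairwise_distances(X[keep])
    y_sub = y[keep]
    before = [hypothesis_margin(a, y, d) for a in np.where(keep)[0]]
    after = [hypothesis_margin(i, y_sub, d_sub) for i in range(len(y_sub))]
    return sum(af - bf for af, bf in zip(after, before))

# Toy data: two well-separated 1-D classes.
X = np.array([[0.0], [0.5], [1.0], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])
print(ccnn_pairs(X, y))
print([round(removal_score(b, X, y), 3) for b in range(len(X))])
```

Under this reading, instances whose removal increases the other instances' margins are natural candidates for deletion in a large margin selection scheme, which is the kind of trade-off the abstract's scoring function is designed to capture.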
Similar resources
Discriminative Learning of the Prototype Set for Nearest Neighbor Classification
The nearest neighbor rule is one of the most widely used models for classification, and selecting a compact set of prototype instances is an important problem for its applications. Many existing approaches to the prototype selection problem rely on instance-based analyses and local criteria on the class distribution, which are intractable for numerical optimization techniques. In this paper, we ...
Liquid-liquid equilibrium data prediction using large margin nearest neighbor
Guanidine hydrochloride has been widely used in the initial recovery steps of active protein from inclusion bodies in aqueous two-phase systems (ATPS). Knowledge of the effects of guanidine hydrochloride on the liquid-liquid equilibrium (LLE) phase diagram behavior is still inadequate, and no comprehensive theory exists for predicting the experimental trends. Therefore, the effect the ...
Large Margin Subspace Learning for feature selection
Recent research has shown the benefits of the large margin framework for feature selection. In this paper, we propose a novel feature selection algorithm, termed Large Margin Subspace Learning (LMSL), which seeks a projection matrix that maximizes the margin of a given sample, defined as the distance between its nearest miss (the nearest neighbor with a different label) and its nearest hit (th...
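Read literally, the margin in this snippet is the standard hypothesis margin evaluated in the learned subspace. One hedged way to formalize it, where P, nearhit, and nearmiss are our notation rather than necessarily the paper's, is

\[
\rho_P(x) \;=\; \lVert Px - P\,\mathrm{nearmiss}(x)\rVert \;-\; \lVert Px - P\,\mathrm{nearhit}(x)\rVert ,
\]

with nearmiss(x) the nearest neighbor of x carrying a different label and nearhit(x) the nearest neighbor carrying the same label; LMSL would then seek the projection matrix P that maximizes this margin over the training samples.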
Time Series Classification by Class-Based Mahalanobis Distances
To classify time series by nearest neighbor, we need to specify or learn a distance. We consider several variations of the Mahalanobis distance and the related Large Margin Nearest Neighbor Classification (LMNN). We find that the conventional Mahalanobis distance is counterproductive. However, both LMNN and the class-based diagonal Mahalanobis distance are competitive.
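For reference, the distances being compared here have a simple closed form; the notation below is a standard textbook statement, not a quotation from the paper:

\[
d_M(x, y) \;=\; \sqrt{(x - y)^{\top} M \,(x - y)}, \qquad M \succeq 0,
\]

where the conventional Mahalanobis distance takes M as the inverse covariance matrix of the data, LMNN learns M from labeled examples, and, as the snippet suggests, a class-based diagonal variant restricts M to a diagonal matrix estimated separately for each class.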
An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to a vast range of library resources. However, classifying documents within a large amount of data is still an issue, and finding particular documents demands time and energy. Classifying similar documents into specific classes can reduce the time needed to search for the required data, particularly for text documents. This is further facilitated by using Artificial...
Journal:
Volume, Issue:
Pages: -
Publication date: 2008